Take Home Exercise 1

To reveal the spatial temporal patterns of monthly cumulative confirmed COVID-19 rate and death rate at sub district level in Jarkata, Indonesia.

Wong Wei Ling www.google.com
09-09-2021

1 Overview

This analysis aims to determine the sub-district with relatively higher number of positive cases rate and death rate by using appropriate thematic mapping technique provided by tmap package. This is done by plotting maps to show the spatial temporal distribution of cumulative positive cases and death rate.

2 Dataset

To perform the analysis, data sets from 2 sources are used:

  1. Cumulative cases are updated daily from https://riwayat-file-covid-19-dki-jakarta-jakartagis.hub.arcgis.com/. A total of 17 .xlsx files are downloaded from March 2020 to July 2021. For files that do not contain NA values, the last day of the month’s file is used i.e. 31/30, else, the next available day is used. Also, as this analysis focuses on sub-district level, the data in the “data” worksheet is used. In particular, the files that I have used are:

    • Standar Kelurahan Data Corona (24 April 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (26 Desember 2020 Pukul 10.00)
    • Standar Kelurahan Data Corona (26 Juni 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (27 Februari 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (27 Maret 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (29 Mei 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (30 April 2020 Pukul 09.00)
    • Standar Kelurahan Data Corona (30 Januari 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (30 Juni 2020 Pukul 09.00)
    • Standar Kelurahan Data Corona (30 November 2020 Pukul 10.00)
    • Standar Kelurahan Data Corona (30 September 2020 Pukul 10.00)
    • Standar Kelurahan Data Corona (31 Agustus 2020 Pukul 10.00)
    • Standar Kelurahan Data Corona (31 Juli 2020 Pukul 09.00)
    • Standar Kelurahan Data Corona (31 Juli 2021 Pukul 10.00)
    • Standar Kelurahan Data Corona (31 Maret 2020 Pukul 08.00)
    • Standar Kelurahan Data Corona (31 MEI 2020 Pukul 09.00)
    • Standar Kelurahan Data Corona (31 Oktober 2020 Pukul 10.00)
  2. A collection of geospatial data in a ESRI shapefile format at different geographical levels is available at https://www.indonesia-geospasial.com/. Only shapefile Batas Desa Provinsi DKI Jakarta at https://www.indonesia-geospasial.com/2020/04/download-shapefile-shp-batas-desa.html will be used.

3 Install and Load Packages

Short explanation for the packages used:

packages <- c('readxl', 'tidyr', 'dplyr', 'readr', 'ggplot2', 'ggpubr', 'sf', 'tmap')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p,character.only = T)
}

4 Data Extraction & Wrangling

4.1 Data Wrangling for Aspatial Data

4.1.1 Create function to remove duplicated rows

filenames_list <- list.files(path= "data/aspatial", full.names=TRUE) 

clean.func <-  function(filename, filenames_list) {
  files <- read_excel(filename, .name_repair = "minimal") 
  files <- files[, !duplicated(colnames(files), fromLast=TRUE)]
}

All <- lapply(filenames_list, clean.func)

4.1.2 Add Date (consists of month and year) column for analysis in later steps

new_col <-c("Apr_2021", "Dec_2020", "Jun_2021", "Feb_2021", "Mar_2021", "May_2021", "Apr_2020", "Jan_2021", "Jun_2020", 
            "Nov_2020", "Sep_2020", "Aug_2020", "Jul_2020", "Jul_2021", "Mar_2020", "May_2020", "Oct_2020")

data_list <- Map(cbind, All, Date = new_col) 

col <- c('Date', 'ID_KEL', 'Nama_provinsi','nama_kota', 'nama_kecamatan', 'nama_kelurahan', 'POSITIF', 'Meninggal')
final <- lapply(data_list,"[", col)

aspatial_final <- do.call(rbind, final)

4.1.3 Check for NA values

print(aspatial_final[rowSums(is.na(aspatial_final)) > 0,])

4.1.4 Remove rows with NA values & ensure that it has been removed

aspatial_final <- drop_na(aspatial_final)
print(aspatial_final[rowSums(is.na(aspatial_final)) > 0,])

4.1.5 Remove rows with incorrect data

aspatial_final <- filter(aspatial_final, ID_KEL!="LUAR DKI JAKARTA")
aspatial_final <- filter(aspatial_final, ID_KEL!="PROSES UPDATE DATA")
aspatial_final <- filter(aspatial_final, ID_KEL!="BELUM DIKETAHUI")

4.1.6 Prepare data for calculation of cumulative positive cases & deaths

aspatial_final <- aspatial_final %>%
  group_by(Date, ID_KEL, Nama_provinsi, nama_kota, nama_kecamatan, nama_kelurahan) %>%
  summarise(`POSITIF` = sum(`POSITIF`), `Meninggal` = sum(`Meninggal`)) %>%
  ungroup() %>%
  pivot_wider(names_from = Date, values_from = c(POSITIF, Meninggal))

4.1.7 Remove rows with NA values & ensure that it has been removed

aspatial_final <- drop_na(aspatial_final)
aspatial_final[rowSums(is.na(aspatial_final)) > 0, ]  

4.1.8 Re-order positive and deaths columns to be sequential

aspatial_final <- aspatial_final[, c(1, 2, 3, 4, 5,
                                     16, 6, 18, 14, 12, 8, 22, 21, 20, 9, 11, 10, 17, 7, 19, 15, 13, 
                                     33, 23, 35, 31, 29, 25, 39, 38, 37, 26, 28, 27, 34, 24, 36, 32, 30)]

4.1.9 Write finalised data as rds format

aspatial_final_rds <- write_rds(aspatial_final, "data/aspatial_final_rds.rds")

4.1.10 Import clean aspatial data

aspatial_final_rds <- read_rds("data/aspatial_final_rds.rds")

4.2 Data Wrangling for Geospatial Data

4.2.1 Import polygon feature data in shapefile format

library(sf)
dki <- st_read(dsn = "data/geospatial", 
                layer = "BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA")
Reading layer `BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA' from data source `C:\wlwong2018\IS415 Blog\_posts\2021-09-09-take-home-exercise-1\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 269 features and 161 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 106.3831 ymin: -6.370815 xmax: 106.9728 ymax: -5.184322
Geodetic CRS:  WGS 84

4.2.2 Check CRS

st_crs(dki)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]

4.2.3 Convert WGS84 to national PCS of Indonesia

dki <- st_transform(dki, 23845)
st_crs(dki)
Coordinate Reference System:
  User input: EPSG:23845 
  wkt:
PROJCRS["DGN95 / Indonesia TM-3 zone 54.1",
    BASEGEOGCRS["DGN95",
        DATUM["Datum Geodesi Nasional 1995",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4755]],
    CONVERSION["Indonesia TM-3 zone 54.1",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",139.5,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9999,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",200000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",1500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre."],
        AREA["Indonesia - onshore east of 138°E."],
        BBOX[-9.19,138,-1.49,141.01]],
    ID["EPSG",23845]]

4.2.4 Plot map to check for outer islands

tmap_mode('view')
tm_shape(dki) +
  tm_dots(alpha=0.4,
          col="blue",
          size=0.05)

4.2.5 Switch back to plot mode after plotting interactive maps

tmap_mode('plot')

4.2.6 Exclude outer islands, drop NA values, extract necessary fields and ensure that NA values have been dropped

dki <- dki[!dki$KAB_KOTA == "KEPULAUAN SERIBU", ] %>%
       drop_na() %>%
      select(OBJECT_ID, KODE_DESA, DESA, KODE, PROVINSI, KAB_KOTA, KECAMATAN, DESA_KELUR, JUMLAH_PEN)
print(dki[rowSums(is.na(dki)) > 0,])
Simple feature collection with 0 features and 9 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: DGN95 / Indonesia TM-3 zone 54.1
 [1] OBJECT_ID  KODE_DESA  DESA       KODE       PROVINSI   KAB_KOTA  
 [7] KECAMATAN  DESA_KELUR JUMLAH_PEN geometry  
<0 rows> (or 0-length row.names)

4.2.7 Check features and fields of dki

print(dki)
Simple feature collection with 261 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -3644275 ymin: 663887.8 xmax: -3606237 ymax: 701380.1
Projected CRS: DGN95 / Indonesia TM-3 zone 54.1
First 10 features:
   OBJECT_ID  KODE_DESA               DESA   KODE    PROVINSI
1      25477 3173031006          KEAGUNGAN 317303 DKI JAKARTA
2      25478 3173031007             GLODOK 317303 DKI JAKARTA
3      25397 3171031003      HARAPAN MULIA 317103 DKI JAKARTA
4      25400 3171031006       CEMPAKA BARU 317103 DKI JAKARTA
7      25390 3171021001         PASAR BARU 317102 DKI JAKARTA
9      25391 3171021002       KARANG ANYAR 317102 DKI JAKARTA
10     25394 3171021005 MANGGA DUA SELATAN 317102 DKI JAKARTA
12     25386 3171011003       PETOJO UTARA 317101 DKI JAKARTA
13     25403 3171041001              SENEN 317104 DKI JAKARTA
14     25408 3171041006             BUNGUR 317104 DKI JAKARTA
        KAB_KOTA   KECAMATAN         DESA_KELUR JUMLAH_PEN
1  JAKARTA BARAT  TAMAN SARI          KEAGUNGAN      21609
2  JAKARTA BARAT  TAMAN SARI             GLODOK       9069
3  JAKARTA PUSAT   KEMAYORAN      HARAPAN MULIA      29085
4  JAKARTA PUSAT   KEMAYORAN       CEMPAKA BARU      41913
7  JAKARTA PUSAT SAWAH BESAR         PASAR BARU      15793
9  JAKARTA PUSAT SAWAH BESAR       KARANG ANYAR      33383
10 JAKARTA PUSAT SAWAH BESAR MANGGA DUA SELATAN      35906
12 JAKARTA PUSAT      GAMBIR       PETOJO UTARA      21828
13 JAKARTA PUSAT       SENEN              SENEN       8643
14 JAKARTA PUSAT       SENEN             BUNGUR      23001
                         geometry
1  MULTIPOLYGON (((-3626874 69...
2  MULTIPOLYGON (((-3627130 69...
3  MULTIPOLYGON (((-3621251 68...
4  MULTIPOLYGON (((-3620608 69...
7  MULTIPOLYGON (((-3624097 69...
9  MULTIPOLYGON (((-3624785 69...
10 MULTIPOLYGON (((-3624752 69...
12 MULTIPOLYGON (((-3626121 69...
13 MULTIPOLYGON (((-3623189 69...
14 MULTIPOLYGON (((-3622451 69...

4.2.8 Write finalised geospatial data as rds format

dki_final_rds <- write_rds(dki, "data/dki_final_rds.rds")

4.2.9 Import clean geosptial Data

dki_final_rds <- read_rds("data/dki_final_rds.rds")

5 Data Integration & Calculations

5.1 Join apsatial & geospatial data

dki_aspatial <- right_join(aspatial_final_rds, dki_final_rds,
                          by = c("ID_KEL" = "KODE_DESA"))

5.2 Calculate cumulative positive cases and death rate

5.2.1 Calculate total postive cases and death rates for each month

dki_aspatial <- dki_aspatial %>%
  mutate(`TOTAL_POSITIF` = rowSums(.[6:22])) %>%
  mutate(`TOTAL_MENINGGAL` = rowSums(.[23:39]))

5.2.2 Calculate cumulative positive cases and death rate per 10,000

dki_aspatial_rate <- dki_aspatial

dki_aspatial_rate[6:22] <- (dki_aspatial_rate[6:22] / dki_aspatial_rate$JUMLAH_PEN) * 10000 

dki_aspatial_rate[23:39] <- (dki_aspatial_rate[23:39] / dki_aspatial_rate$JUMLAH_PEN) * 10000

5.3 Calculate relative risk

dki_aspatial_risk <- dki_aspatial %>%
  mutate(`RELATIVE_RISK` = (rowSums(.[23:39])*100) / (rowSums(.[23:39])*dki_aspatial$JUMLAH_PEN) )

6 Exploratory Data Analysis (EDA)

6.1 Descriptive statistics

summary(dki_aspatial_rate)
    ID_KEL          Nama_provinsi       nama_kota        
 Length:261         Length:261         Length:261        
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
 nama_kecamatan     nama_kelurahan     POSITIF_Mar_2020 
 Length:261         Length:261         Min.   : 0.0000  
 Class :character   Class :character   1st Qu.: 0.0000  
 Mode  :character   Mode  :character   Median : 0.3264  
                                       Mean   : 0.7609  
                                       3rd Qu.: 0.6928  
                                       Max.   :49.8826  
 POSITIF_Apr_2020 POSITIF_May_2020 POSITIF_Jun_2020 
 Min.   : 0.000   Min.   : 0.000   Min.   :  0.000  
 1st Qu.: 1.393   1st Qu.: 2.646   1st Qu.:  4.179  
 Median : 2.220   Median : 4.198   Median :  6.515  
 Mean   : 3.370   Mean   : 5.458   Mean   :  8.728  
 3rd Qu.: 4.016   3rd Qu.: 6.796   3rd Qu.: 10.631  
 Max.   :49.883   Max.   :49.883   Max.   :105.336  
 POSITIF_Jul_2020  POSITIF_Aug_2020  POSITIF_Sep_2020 
 Min.   :  1.518   Min.   :  2.429   Min.   :  6.681  
 1st Qu.:  7.477   1st Qu.: 14.134   1st Qu.: 32.867  
 Median : 10.690   Median : 19.532   Median : 41.312  
 Mean   : 14.002   Mean   : 25.203   Mean   : 51.530  
 3rd Qu.: 15.758   3rd Qu.: 31.381   3rd Qu.: 58.183  
 Max.   :112.243   Max.   :210.492   Max.   :511.658  
 POSITIF_Oct_2020 POSITIF_Nov_2020 POSITIF_Dec_2020 
 Min.   : 13.82   Min.   : 28.85   Min.   :  40.85  
 1st Qu.: 54.17   1st Qu.: 75.33   1st Qu.: 101.13  
 Median : 64.34   Median : 88.97   Median : 117.50  
 Mean   : 76.97   Mean   :103.09   Mean   : 134.74  
 3rd Qu.: 85.28   3rd Qu.:110.95   3rd Qu.: 143.32  
 Max.   :605.57   Max.   :783.68   Max.   :1010.36  
 POSITIF_Jan_2021  POSITIF_Feb_2021  POSITIF_Mar_2021 
 Min.   :  59.22   Min.   :  74.51   Min.   :  81.87  
 1st Qu.: 163.49   1st Qu.: 215.35   1st Qu.: 244.44  
 Median : 192.08   Median : 254.66   Median : 291.74  
 Mean   : 212.31   Mean   : 278.28   Mean   : 315.05  
 3rd Qu.: 229.76   3rd Qu.: 307.30   3rd Qu.: 345.77  
 Max.   :1318.01   Max.   :1628.89   Max.   :1810.23  
 POSITIF_Apr_2021  POSITIF_May_2021  POSITIF_Jun_2021
 Min.   :  89.58   Min.   :  91.76   Min.   : 111.4  
 1st Qu.: 261.89   1st Qu.: 275.06   1st Qu.: 337.1  
 Median : 312.88   Median : 332.16   Median : 398.0  
 Mean   : 338.72   Mean   : 360.74   Mean   : 435.1  
 3rd Qu.: 370.19   3rd Qu.: 391.74   3rd Qu.: 473.9  
 Max.   :1988.34   Max.   :2075.78   Max.   :2590.7  
 POSITIF_Jul_2021 Meninggal_Mar_2020 Meninggal_Apr_2020
 Min.   : 187.3   Min.   :0.00000    Min.   :0.0000    
 1st Qu.: 545.1   1st Qu.:0.00000    1st Qu.:0.0000    
 Median : 658.3   Median :0.00000    Median :0.2074    
 Mean   : 705.3   Mean   :0.08094    Mean   :0.3482    
 3rd Qu.: 779.7   3rd Qu.:0.00000    3rd Qu.:0.4854    
 Max.   :3808.3   Max.   :3.05344    Max.   :6.1069    
 Meninggal_May_2020 Meninggal_Jun_2020 Meninggal_Jul_2020
 Min.   :0.0000     Min.   :0.0000     Min.   :0.0000    
 1st Qu.:0.0000     1st Qu.:0.0000     1st Qu.:0.2155    
 Median :0.3372     Median :0.4144     Median :0.5490    
 Mean   :0.4611     Mean   :0.5579     Mean   :0.6912    
 3rd Qu.:0.6447     3rd Qu.:0.7939     3rd Qu.:0.9707    
 Max.   :6.1069     Max.   :6.1069     Max.   :6.1069    
 Meninggal_Aug_2020 Meninggal_Sep_2020 Meninggal_Oct_2020
 Min.   :0.0000     Min.   :0.0000     Min.   :0.0000    
 1st Qu.:0.3321     1st Qu.:0.5694     1st Qu.:0.9442    
 Median :0.6899     Median :0.9934     Median :1.4868    
 Mean   :0.8827     Mean   :1.2192     Mean   :1.6846    
 3rd Qu.:1.2247     3rd Qu.:1.6270     3rd Qu.:2.2127    
 Max.   :6.1069     Max.   :6.1069     Max.   :6.1069    
 Meninggal_Nov_2020 Meninggal_Dec_2020 Meninggal_Jan_2021
 Min.   :0.000      Min.   :0.000      Min.   : 0.000    
 1st Qu.:1.166      1st Qu.:1.607      1st Qu.: 2.415    
 Median :1.839      Median :2.321      Median : 3.266    
 Mean   :2.022      Mean   :2.480      Mean   : 3.506    
 3rd Qu.:2.639      3rd Qu.:3.136      3rd Qu.: 4.337    
 Max.   :6.107      Max.   :9.160      Max.   :16.192    
 Meninggal_Feb_2021 Meninggal_Mar_2021 Meninggal_Apr_2021
 Min.   : 1.318     Min.   : 1.388     Min.   : 1.679    
 1st Qu.: 3.228     1st Qu.: 3.993     1st Qu.: 4.243    
 Median : 4.411     Median : 5.201     Median : 5.572    
 Mean   : 4.598     Mean   : 5.376     Mean   : 5.679    
 3rd Qu.: 5.541     3rd Qu.: 6.406     3rd Qu.: 6.796    
 Max.   :19.430     Max.   :25.907     Max.   :25.907    
 Meninggal_May_2021 Meninggal_Jun_2021 Meninggal_Jul_2021
 Min.   : 1.964     Min.   : 2.244     Min.   : 3.439    
 1st Qu.: 4.688     1st Qu.: 5.339     1st Qu.: 8.446    
 Median : 6.124     Median : 7.036     Median :10.298    
 Mean   : 6.323     Mean   : 7.145     Mean   :10.602    
 3rd Qu.: 7.534     3rd Qu.: 8.375     3rd Qu.:12.508    
 Max.   :29.145     Max.   :29.145     Max.   :42.098    
   OBJECT_ID         DESA                KODE       
 Min.   :25384   Length:261         Min.   :317101  
 1st Qu.:25449   Class :character   1st Qu.:317204  
 Median :25514   Mode  :character   Median :317308  
 Mean   :25514                      Mean   :317334  
 3rd Qu.:25579                      3rd Qu.:317410  
 Max.   :25644                      Max.   :317510  
   PROVINSI           KAB_KOTA          KECAMATAN        
 Length:261         Length:261         Length:261        
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
  DESA_KELUR          JUMLAH_PEN              geometry  
 Length:261         Min.   :  3088   MULTIPOLYGON :261  
 Class :character   1st Qu.: 26177   epsg:23845   :  0  
 Mode  :character   Median : 38845   +proj=tmer...:  0  
                    Mean   : 42082                      
                    3rd Qu.: 52424                      
                    Max.   :167523                      
 TOTAL_POSITIF   TOTAL_MENINGGAL
 Min.   :  967   Min.   :  8    
 1st Qu.: 7588   1st Qu.:129    
 Median :10833   Median :192    
 Mean   :11371   Mean   :207    
 3rd Qu.:14503   3rd Qu.:263    
 Max.   :27221   Max.   :729    



Descriptive statistics for Mar 2020 positive cases

summary(dki_aspatial_rate$POSITIF_Mar_2020)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.0000  0.3264  0.7609  0.6928 49.8826 

Descriptive statistics for Jul 2021 positive cases

summary(dki_aspatial_rate$POSITIF_Jul_2021)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  187.3   545.1   658.3   705.3   779.7  3808.3 



Descriptive statistics for Mar 2020 death rates

summary(dki_aspatial_rate$Meninggal_Mar_2020)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00000 0.00000 0.00000 0.08094 0.00000 3.05344 

Descriptive statistics for Jul 2021 death rates

summary(dki_aspatial_rate$Meninggal_Jul_2021)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  3.439   8.446  10.298  10.602  12.508  42.098 

6.2 Time-series Box Plot

6.2.1 Cumulative Positive Cases

boxplot_1 <- ggplot(data=dki_aspatial,
  aes(x = "", y = POSITIF_Mar_2020)) +
  geom_boxplot()

boxplot_2 <- ggplot(data=dki_aspatial,
  aes(x = "", y = POSITIF_Dec_2020)) +
  geom_boxplot()


boxplot_3 <- ggplot(data=dki_aspatial,
  aes(x = "", y = POSITIF_Jul_2021)) +
  geom_boxplot()

ggarrange(boxplot_1, boxplot_2, boxplot_3, nrow = 1)

6.2.2 Cumulative Death

boxplot_1_m <- ggplot(data=dki_aspatial,
  aes(x = "", y = Meninggal_Mar_2020)) +
  geom_boxplot()

boxplot_2_m <- ggplot(data=dki_aspatial,
  aes(x = "", y = Meninggal_Dec_2020)) +
  geom_boxplot()

boxplot_3_m <- ggplot(data=dki_aspatial,
  aes(x = "", y = Meninggal_Jul_2021)) +
  geom_boxplot()

ggarrange(boxplot_1_m, boxplot_2_m, boxplot_3_m, nrow = 1)

7 Thematic Mapping

Choropleth maps would be used to better understand the spatial temporal distribution of the monthly cumulative positive cases rate and cumulative death rates at sub district level.

dki_aspatial <- st_as_sf(dki_aspatial)
dki_aspatial_rate <- st_as_sf(dki_aspatial_rate)
dki_aspatial_risk <- st_as_sf(dki_aspatial_risk)

7.1 Choropleth Map with Custom Break

7.1.1 Positive Cases Rate

7.1.1.1 Function for custom break & plot choropleth map using function

custom_map_pos <- function(vnam, df, legtitle=NA){
  tm_shape(dki_aspatial_rate) +
     tm_fill(vnam,
             title=legtitle,
             n = 5,
             breaks = c(0, 476, 952, 1428, 1904, 2380, 2856, 3332, 3808),
             palette="Reds")  +
  tm_borders() +
  tm_layout(main.title = vnam,
            main.title.position = c("center","top"),
            main.title.size = 0.7,
            frame = FALSE,
            legend.show = FALSE)
}

tmap_arrange(custom_map_pos("POSITIF_Mar_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Apr_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_May_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Jun_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Jul_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Aug_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Sep_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Oct_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Nov_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Dec_2020", dki_aspatial_rate),
             custom_map_pos("POSITIF_Jan_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_Feb_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_Mar_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_Apr_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_May_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_Jun_2021", dki_aspatial_rate),
             custom_map_pos("POSITIF_Jul_2021", dki_aspatial_rate))

7.1.2 Death Rates

7.1.2.1 Function for custom break & plot choropleth map using function

custom_map_death <- function(vnam, df, legtitle=NA){
  tm_shape(dki_aspatial_rate) +
     tm_fill(vnam,
             title=legtitle,
             n = 5,
             breaks = c(0, 6, 12, 18, 24, 30, 36, 42),
             palette="Reds")  +
  tm_borders() +
  tm_layout(main.title = vnam,
            main.title.position = c("center","top"),
            main.title.size = 0.7,
            frame = FALSE,
            legend.show = FALSE)
}

tmap_arrange(custom_map_death("Meninggal_Mar_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Apr_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_May_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Jun_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Jul_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Aug_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Sep_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Oct_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Nov_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Dec_2020", dki_aspatial_rate),
             custom_map_death("Meninggal_Jan_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_Feb_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_Mar_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_Apr_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_May_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_Jun_2021", dki_aspatial_rate),
             custom_map_death("Meninggal_Jul_2021", dki_aspatial_rate))


7.2 Choropleth Map with Jenks Data Classification

7.2.1 Function for Jenks Data Classification

jenksmap <- function(vnam, df, legtitle=NA){
  tm_shape(dki_aspatial_rate) +
     tm_fill(vnam,
             title=legtitle,
             n = 5,
             style = "jenks",
             palette="Reds")  +
  tm_borders() +
  tm_layout(main.title = vnam,
            main.title.position = c("center","top"),
            main.title.size = 0.7,
            frame = FALSE,
            legend.show = FALSE)
}

7.2.2 Plot choropleth maps with Jenks Data Classification (Positive Cases Rate)

tmap_arrange(jenksmap("POSITIF_Mar_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Apr_2020", dki_aspatial_rate),
             jenksmap("POSITIF_May_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Jun_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Jul_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Aug_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Sep_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Oct_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Nov_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Dec_2020", dki_aspatial_rate),
             jenksmap("POSITIF_Jan_2021", dki_aspatial_rate),
             jenksmap("POSITIF_Feb_2021", dki_aspatial_rate),
             jenksmap("POSITIF_Mar_2021", dki_aspatial_rate),
             jenksmap("POSITIF_Apr_2021", dki_aspatial_rate),
             jenksmap("POSITIF_May_2021", dki_aspatial_rate),
             jenksmap("POSITIF_Jun_2021", dki_aspatial_rate),
             jenksmap("POSITIF_Jul_2021", dki_aspatial_rate))

7.2.3 Plot choropleth maps with Jenks Data Classification (Death Rate)

tmap_arrange(jenksmap("Meninggal_Mar_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Apr_2020", dki_aspatial_rate),
             jenksmap("Meninggal_May_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Jun_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Jul_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Aug_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Sep_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Oct_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Nov_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Dec_2020", dki_aspatial_rate),
             jenksmap("Meninggal_Jan_2021", dki_aspatial_rate),
             jenksmap("Meninggal_Feb_2021", dki_aspatial_rate),
             jenksmap("Meninggal_Mar_2021", dki_aspatial_rate),
             jenksmap("Meninggal_Apr_2021", dki_aspatial_rate),
             jenksmap("Meninggal_May_2021", dki_aspatial_rate),
             jenksmap("Meninggal_Jun_2021", dki_aspatial_rate),
             jenksmap("Meninggal_Jul_2021", dki_aspatial_rate))


8 Analytical Mapping

  1. Box maps will be used to identify sub districts with outliers of cumulative positive cases rates and death rates.

    • The groups specified are: “Lower outlier”, “< 25%”, “25% - 50%”, “50% - 75%”,“> 75%”, “Upper outlier”. Values would be considered as outliers if they are 1.5 times higher than the interquartile range. With box maps, significant spatial patterns are more likely to be revealed.
  2. Relative risk map will also be plotted to compare the observed mortality to the expected mortality.

8.1 Functions needed to plot box maps

# Extract variable as vector out of sf dataframe
get.var <- function(vname, df) {
  v <- df[vname]
  v <- unname(v[[1]])
  return(v)
}

# To create break points for box map
boxbreaks <- function(v, mult=1.5) {
  qv <- unname(quantile(v))
  iqr <- qv[4] - qv[2]
  upfence <- qv[4] + mult * iqr
  lofence <- qv[2] - mult * iqr

  # initialize break points vector
  bb <- vector(mode="numeric",length=7)

  # logic for lower and upper fences
  if (lofence < qv[1]) { # no lower outliers
    bb[1] <- lofence
    bb[2] <- floor(qv[1])
  } else {
    bb[2] <- lofence
    bb[1] <- qv[1]
  }
  if (upfence > qv[5]) { # no upper outliers
    bb[7] <- upfence
    bb[6] <- ceiling(qv[5])
  } else {
    bb[6] <- upfence
    bb[7] <- qv[5]
  }
  bb[3:5] <- qv[2:4]
  return(bb)
}


# Boxmap function
boxmap <- function(vnam, df, mtitle, legtitle=NA, mult=1.5, palette='Reds') {
  df1 <- drop_na(df)
  var <- get.var(vnam,df1)
  bb <- boxbreaks(var)
  tm_shape(df) +
    tm_fill(vnam,
            title=legtitle,
            breaks=bb,
            palette=palette,
            labels = c("Lower outlier", "< 25%", "25% - 50%", "50% - 75%","> 75%", "Upper outlier")) +
  tm_borders(lwd=0.1, alpha=1) +
  tm_layout(main.title = vnam,
            main.title.position = 'center',
            main.title.size = 0.7,
            frame = FALSE,
            legend.show = FALSE) +
  tm_scale_bar(width = 0.15)
}

8.2 Test boxbreaks function

var <- get.var("Meninggal_Mar_2020", dki_aspatial_rate)
boxbreaks(var)
[1] 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3.053435

8.3 Time-series box map of cumulative positive cases rates

tmap_arrange(boxmap("POSITIF_Mar_2020", dki_aspatial_rate),
            boxmap("POSITIF_Apr_2020", dki_aspatial_rate),
            boxmap("POSITIF_May_2020", dki_aspatial_rate),
            boxmap("POSITIF_Jun_2020", dki_aspatial_rate),
            boxmap("POSITIF_Jul_2020", dki_aspatial_rate),
            boxmap("POSITIF_Aug_2020", dki_aspatial_rate),
            boxmap("POSITIF_Sep_2020", dki_aspatial_rate),
            boxmap("POSITIF_Oct_2020", dki_aspatial_rate),
            boxmap("POSITIF_Nov_2020", dki_aspatial_rate),
            boxmap("POSITIF_Dec_2020", dki_aspatial_rate),
            boxmap("POSITIF_Jan_2021", dki_aspatial_rate),
            boxmap("POSITIF_Feb_2021", dki_aspatial_rate),
            boxmap("POSITIF_Mar_2021", dki_aspatial_rate),
            boxmap("POSITIF_Apr_2021", dki_aspatial_rate),
            boxmap("POSITIF_May_2021", dki_aspatial_rate),
            boxmap("POSITIF_Jun_2021", dki_aspatial_rate),
            boxmap("POSITIF_Jul_2021", dki_aspatial_rate))

8.4 Time-series box map of cumulative death rates

tmap_arrange(boxmap("Meninggal_Apr_2020", dki_aspatial_rate),
            boxmap("Meninggal_May_2020", dki_aspatial_rate),
            boxmap("Meninggal_Jun_2020", dki_aspatial_rate),
            boxmap("Meninggal_Jul_2020", dki_aspatial_rate),
            boxmap("Meninggal_Aug_2020", dki_aspatial_rate),
            boxmap("Meninggal_Sep_2020", dki_aspatial_rate),
            boxmap("Meninggal_Oct_2020", dki_aspatial_rate),
            boxmap("Meninggal_Nov_2020", dki_aspatial_rate),
            boxmap("Meninggal_Dec_2020", dki_aspatial_rate),
            boxmap("Meninggal_Jan_2021", dki_aspatial_rate),
            boxmap("Meninggal_Feb_2021", dki_aspatial_rate),
            boxmap("Meninggal_Mar_2021", dki_aspatial_rate),
            boxmap("Meninggal_Apr_2021", dki_aspatial_rate),
            boxmap("Meninggal_May_2021", dki_aspatial_rate),
            boxmap("Meninggal_Jun_2021", dki_aspatial_rate),
            boxmap("Meninggal_Jul_2021", dki_aspatial_rate))

8.5 Relative Risk map vs. Total Deaths map for sub-districts

relative_risk_map <- tm_shape(dki_aspatial_risk) +
  tm_fill(col = "RELATIVE_RISK", n = 5, style = "jenks", title = "Relative Risk") +
   tm_layout(main.title = "Relative Risk",
             main.title.position = "center",
             main.title.size = 1.2,
            legend.height = 0.27,
            legend.width = 0.35) + 
  tm_borders(alpha = 0.5) + 
tm_credits("Source: Open Data Covid-19 Provinsi DKI Jakarta\n and PODES 2019",
           position = c("left", "bottom"))


total_death_map <- tm_shape(dki_aspatial)+
  tm_fill(col="TOTAL_MENINGGAL", style="jenks", n = 5, title= "Total Deaths" ) +
   tm_layout(main.title = "Total Deaths",
             main.title.position = "center",
             main.title.size = 1.2,
            legend.height = 0.27,
            legend.width = 0.35) +
  tm_borders(alpha = 0.5) +
 tm_credits("Source: Open Data Covid-19 Provinsi DKI Jakarta\n and PODES 2019",
           position = c("left", "bottom"))

tmap_arrange(relative_risk_map, total_death_map, asp=1, ncol=2)

9 Conclusion

10 References